Synthesizing breathiness in natural speech with sinusoidal modelling
Abstract
This paper discusses recent work on synthesizing a breathy quality in pre-recorded speech, which has applications in voice morphing and concatenative TTS. Previous work has shown that the breathy quality in speech is characterized in part by the presence of random noise in the upper region of the spectrum [1]. The sinusoidal modelling representation of speech facilitates high-quality modifications to speech signals and allows regions of the spectrum to be modified independently. We use sinusoidal modelling, along with techniques borrowed from analog communication systems, to simulate aspiration noise in wideband speech signals above some lower cutoff frequency. Specifically, we use techniques based on amplitude modulation (AM) and phase modulation (PM), with the harmonics from the sinusoidal model of speech as carriers and lowpass random noise as the message signal. Formal listening tests were conducted: listeners rated the synthesized speech as “breathy” more often than natural non-breathy speech, but significantly less often than naturally breathy speech.
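The AM/PM idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the modulation depth `am_depth`, the phase index `pm_index`, the noise bandwidth, and the choice of an independent noise signal per harmonic are all assumptions made for illustration, and a steady harmonic tone stands in for a real sinusoidal model of speech.

```python
import numpy as np

def lowpass_noise(n_samples, fs, bandwidth_hz, rng):
    """Zero-mean Gaussian noise band-limited to (0, bandwidth_hz], peak-normalized."""
    spectrum = np.fft.rfft(rng.standard_normal(n_samples))
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    # Brick-wall lowpass in the frequency domain; DC is removed as well.
    spectrum[(freqs > bandwidth_hz) | (freqs == 0.0)] = 0.0
    noise = np.fft.irfft(spectrum, n=n_samples)
    return noise / np.max(np.abs(noise))

def breathy_harmonics(f0, amps, fs, duration, cutoff_hz,
                      am_depth=0.8, pm_index=1.0, noise_bw=None, seed=0):
    """Sum of harmonics of f0; harmonics at or above cutoff_hz act as carriers
    that are amplitude- and phase-modulated by lowpass noise (hypothetical parameters)."""
    rng = np.random.default_rng(seed)
    n = int(round(fs * duration))
    t = np.arange(n) / fs
    if noise_bw is None:
        noise_bw = f0 / 2.0  # assumption: keep the message narrower than the harmonic spacing
    out = np.zeros(n)
    for k, a in enumerate(amps, start=1):
        fk = k * f0
        if fk >= fs / 2.0:
            break  # skip harmonics at or above the Nyquist frequency
        phase = 2.0 * np.pi * fk * t
        if fk >= cutoff_hz:
            msg = lowpass_noise(n, fs, noise_bw, rng)  # message signal
            # AM scales the carrier envelope; PM adds the message to the carrier phase.
            out += a * (1.0 + am_depth * msg) * np.cos(phase + pm_index * msg)
        else:
            out += a * np.cos(phase)  # low harmonics are left untouched
    return out

if __name__ == "__main__":
    fs = 16000
    amps = [1.0 / k for k in range(1, 31)]  # toy spectral tilt, not measured data
    y = breathy_harmonics(200.0, amps, fs, 0.5, cutoff_hz=4000.0)
    print(y.shape)
```

Setting `am_depth` or `pm_index` to zero isolates one modulation type, which is one way to compare the two techniques by ear.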
Similar resources
A singing voice synthesis system based on sinusoidal modeling
Although sinusoidal models have been demonstrated to be capable of high-quality musical instrument synthesis [1], speech modification [2], and speech synthesis [3], little exploration of the application of these models to the synthesis of singing voice has been undertaken. In this paper, we propose a system framework similar to that employed in concatenation-based text-to-speech synthesizers, an...
Exponential sinusoidal modeling of transitional speech segments
A generalized sinusoidal model for speech signal processing is studied. The main feature of the model is that the amplitude of each sinusoidal component is allowed to vary exponentially with time. We propose to use the model in transitional speech segments such as speech onsets and voiced/unvoiced transitions. Computer simulations with natural speech signals indicate substantially better modeling...
Analysis/synthesis and modification of the speech aperiodic component
The general framework of this paper is speech analysis and synthesis. The speech signal may be separated into two components: (1) a periodic component (which includes the quasi-periodic or voiced sounds produced by regular vocal cord vibrations); (2) an aperiodic component (which includes the non-periodic part of voiced sounds (e.g. fricative noise in /v/) or sound emitted without any vocal cor...
Acoustic correlates of breathy vocal quality.
The purpose of this study was to evaluate the effectiveness of several acoustic measures in predicting breathiness ratings. Recordings were made of eight normal men and seven normal women producing normally phonated, moderately breathy, and very breathy sustained vowels. Twenty listeners rated the degree of breathiness using a direct magnitude estimation procedure. Acoustic measures were made o...
Acoustic correlates of breathy vocal quality: dysphonic voices and continuous speech.
In an earlier study, we evaluated the effectiveness of several acoustic measures in predicting breathiness ratings for sustained vowels spoken by nonpathological talkers who were asked to produce nonbreathy, moderately breathy, and very breathy phonation (Hillenbrand, Cleveland, & Erickson, 1994). The purpose of the present study was to extend these results to speakers with laryngeal pathologie...